Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation
To this end, it is crucial to learn novel classes incrementally without forgetting previously learned knowledge. Current CISS methods typically use a knowledge distillation (KD) technique to preserve classifier logits, or freeze a feature extractor, in order to avoid the forgetting problem. These strong constraints, however, prevent learning discriminative features for novel classes. We introduce a CISS framework that alleviates the forgetting problem and facilitates learning novel classes effectively. We have found that a logit can be decomposed into two terms.
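As the abstract notes, prior CISS methods constrain the current model to reproduce the previous model's classifier logits. Below is a minimal sketch, not the paper's exact formulation, of such a logit-level distillation term for dense prediction; the tensor names and the KL-based matching are illustrative assumptions.

```python
import torch
import torch.nn.functional as F


def logit_kd_loss(new_logits: torch.Tensor, old_logits: torch.Tensor) -> torch.Tensor:
    """Generic logit-preserving KD term for segmentation (illustrative only).

    new_logits: (B, C_old, H, W) logits of the current model, restricted to old classes.
    old_logits: (B, C_old, H, W) logits of the frozen previous-step model.
    """
    # Match the per-pixel class distributions of the two models.
    log_p_new = F.log_softmax(new_logits, dim=1)
    p_old = F.softmax(old_logits, dim=1)
    return F.kl_div(log_p_new, p_old, reduction="batchmean")
```

Because such a term ties all old-class logits, and hence the shared features, to the previous model, it acts as the strong constraint that the abstract argues hinders learning discriminative features for novel classes.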
Decomposed Knowledge Distillation for Class-Incremental Semantic Segmentation (Supplementary Material)
All numbers are obtained by averaging results over five runs, with standard deviations reported in parentheses. We have empirically set α to 5 for all experiments. We can also see from Table S6 that our method is robust to various choices of β.
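For concreteness, the reporting convention above (mean over five runs, standard deviation in parentheses) can be reproduced with a few lines; the scores below are placeholders, not results from the paper.

```python
import numpy as np

# Hypothetical per-run mIoU scores; substitute the actual five-run results.
miou_runs = np.array([0.0, 0.0, 0.0, 0.0, 0.0])

mean, std = miou_runs.mean(), miou_runs.std()
print(f"{mean:.2f} ({std:.2f})")  # mean with standard deviation in parentheses
```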